Making a map for your Final Report is completely optional. It is only recommended if you have time and are already comfortable with the other course content.
Please use this document as a resource to help you make a map to convey your own ideas. Make sure to list this resource in your reference section if you use it.
Plagiarism Looks like:
If you are found to have plagiarized on the Final Project, you may not receive a letter of completion for the Data Visualization course as part of the KCDS Certificate.
******************
In the Data Visualization Final Project, you may be tempted to create a map for your analysis. Things are not always straightforward, so we created this additional resource to help you if you are feeling ambitious.
# Importing in your required libraries
import pandas as pd
import altair as alt
alt.data_transformers.enable('default', max_rows=1000000)
import json
Remember that when you export your notebook to an HTML file, you need to comment out the line alt.data_transformers.enable('data_server') if your notebook has one, so that the visualizations appear in the output. Then re-run your entire notebook and go to File > Export Notebook As > HTML.
Let's bring in the data. For reference, this data is a subset of the original data available on The Vancouver Data Portal. The data that we have given you was adapted from the json file and wrangled slightly so that it's easy to use for geographical visualizations.
df = pd.read_csv('vancouver_trees.csv')
df.head()
| | std_street | on_street | species_name | neighbourhood_name | date_planted | diameter | street_side_name | genus_name | assigned | civic_number | plant_area | curb | tree_id | common_name | height_range_id | on_street_block | cultivar_name | root_barrier | latitude | longitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | W 13TH AV | MAPLE ST | PSEUDOPLATANUS | Kitsilano | NaN | 9.00 | EVEN | ACER | N | 1996 | 10 | Y | 13310 | SYCAMORE MAPLE | 4 | 2900 | NaN | N | 49.259856 | -123.150586 |
| 1 | WALES ST | WALES ST | PLATANOIDES | Renfrew-Collingwood | 2018-11-28 | 3.00 | ODD | ACER | N | 5291 | 7 | Y | 259084 | PRINCETON GOLD MAPLE | 1 | 5200 | PRINCETON GOLD | N | 49.236650 | -123.051831 |
| 2 | W BROADWAY | W BROADWAY | RUBRUM | Kitsilano | 1996-04-19 | 14.00 | EVEN | ACER | N | 3618 | C | Y | 167986 | KARPICK RED MAPLE | 3 | 3600 | KARPICK | N | 49.264250 | -123.184020 |
| 3 | PENTICTON ST | PENTICTON ST | CALLERYANA | Renfrew-Collingwood | 2006-03-06 | 3.75 | EVEN | PYRUS | N | 2502 | 5 | Y | 213386 | CHANTICLEER PEAR | 1 | 2500 | CHANTICLEER | Y | 49.261036 | -123.052921 |
| 4 | RHODES ST | RHODES ST | GLYPTOSTROBOIDES | Renfrew-Collingwood | 2001-11-01 | 3.00 | ODD | METASEQUOIA | N | 5639 | N | Y | 189223 | DAWN REDWOOD | 2 | 5600 | NaN | N | 49.233354 | -123.050249 |
We can use this data to make many different visualizations (with additional wrangling), but this resource is here to explain how to make maps with it.
Since Altair does not make Vancouver easy to locate on a global map, and there is no projection for Canada like there is for the United States, we've made the GeoJSON for Vancouver and its neighbourhoods available through a URL. This was also obtained from the Vancouver Data Portal.
To make a base map of Vancouver we use the geojson url saved in url_geojson.
url_geojson = 'https://raw.githubusercontent.com/UBC-MDS/exploratory-data-viz/main/data/local-area-boundary.geojson'
Next, we wrap the URL in alt.Data() and tell Altair to read the file as JSON and take its rows from the features property, which is where a GeoJSON file stores its shapes.
data_geojson_remote = alt.Data(url=url_geojson, format=alt.DataFormat(property='features',type='json'))
data_geojson_remote
Data({
format: DataFormat({
property: 'features',
type: 'json'
}),
url: 'https://raw.githubusercontent.com/UBC-MDS/exploratory-data-viz/main/data/local-area-boundary.geojson'
})
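To see why property='features' is needed, it helps to look at the shape of a GeoJSON file. The sketch below uses a tiny, made-up FeatureCollection (the coordinates and the single feature are illustrative, not taken from the real Vancouver file) and pulls out the same features list that Altair is told to read.

```python
import json

# A tiny, made-up GeoJSON document (hypothetical data, not the real file).
sample_geojson = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Kitsilano"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-123.18, 49.26], [-123.14, 49.26],
                         [-123.14, 49.28], [-123.18, 49.28],
                         [-123.18, 49.26]]]
      }
    }
  ]
}
""")

# property='features' asks Altair to use this list as the rows of the dataset.
features = sample_geojson["features"]

# Each row keeps its attributes under "properties", which is why the
# neighbourhood name is reached through 'properties.name'.
feature_names = [feature["properties"]["name"] for feature in features]
print(feature_names)
```

Each feature becomes one row of the chart's data, with its geometry drawn by mark_geoshape().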
We can then make our base map of Vancouver from the data_geojson_remote object, just as we've made maps in the past, except that this time we need to use the identity projection type and set reflectY=True. Without this second argument, our map of Vancouver is upside down.
vancouver_map = alt.Chart(data_geojson_remote).mark_geoshape(
color = 'gray', opacity= 0.5, stroke='white').encode(
).project(type='identity', reflectY=True)
vancouver_map
Nice, we have a base map of Vancouver! 🎉
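The reason reflectY=True is needed comes down to screen coordinates: on a screen, y grows downward, while latitude grows northward. The sketch below is a simplified illustration of that idea in plain Python, not the projection code Altair actually runs; the helper to_screen_y and the 100-pixel height are assumptions made for the example.

```python
def to_screen_y(latitude, lat_min, lat_max, height=100, reflect_y=False):
    """Scale a latitude into a pixel row between 0 (top) and height (bottom)."""
    fraction = (latitude - lat_min) / (lat_max - lat_min)
    if reflect_y:
        # Flip so that larger latitudes (further north) land nearer the top.
        fraction = 1 - fraction
    return fraction * height


north, south = 49.286253, 49.212933  # a northern and a southern latitude

# Without the flip, the northern point is drawn at the bottom of the image.
print(to_screen_y(north, south, north), to_screen_y(south, south, north))

# With the flip, the northern point is drawn at the top, as on a normal map.
print(to_screen_y(north, south, north, reflect_y=True),
      to_screen_y(south, south, north, reflect_y=True))
```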
Now all we have to do is combine this with some of our tree data.
Let's plot the median diameter of the tree trunks for each neighbourhood.
I'm going to rename neighbourhood_name to name, since that's what it's called in the GeoJSON file and we need a matching key to connect the shapes to our dataframe using the function transform_lookup().
I'm also going to select the median latitude and longitude columns for each neighbourhood as I'm going to make a point map using these coordinates after.
# Select the numeric columns first so that median() does not trip over text columns
median_df = (
    df.groupby('neighbourhood_name')[['diameter', 'latitude', 'longitude']]
    .median()
    .reset_index()
    .rename(columns={'neighbourhood_name': 'name'})
)
median_df
| | name | diameter | latitude | longitude |
|---|---|---|---|---|
| 0 | Arbutus-Ridge | 10.000 | 49.248710 | -123.162175 |
| 1 | Downtown | 7.000 | 49.279975 | -123.119297 |
| 2 | Dunbar-Southlands | 12.000 | 49.245220 | -123.184440 |
| 3 | Fairview | 10.500 | 49.263544 | -123.130863 |
| 4 | Grandview-Woodland | 10.000 | 49.272469 | -123.064278 |
| 5 | Hastings-Sunrise | 10.000 | 49.274245 | -123.043578 |
| 6 | Kensington-Cedar Cottage | 10.000 | 49.246389 | -123.074296 |
| 7 | Kerrisdale | 11.000 | 49.227908 | -123.153492 |
| 8 | Killarney | 8.750 | 49.222751 | -123.037832 |
| 9 | Kitsilano | 13.000 | 49.264121 | -123.162942 |
| 10 | Marpole | 9.000 | 49.212933 | -123.131384 |
| 11 | Mount Pleasant | 11.625 | 49.261609 | -123.098054 |
| 12 | Oakridge | 9.000 | 49.227719 | -123.126659 |
| 13 | Renfrew-Collingwood | 7.750 | 49.246218 | -123.040136 |
| 14 | Riley Park | 10.000 | 49.247340 | -123.102256 |
| 15 | Shaughnessy | 12.500 | 49.246229 | -123.141200 |
| 16 | South Cambie | 9.650 | 49.249015 | -123.120555 |
| 17 | Strathcona | 9.000 | 49.278289 | -123.090066 |
| 18 | Sunset | 9.500 | 49.221793 | -123.092664 |
| 19 | Victoria-Fraserview | 9.000 | 49.221164 | -123.063417 |
| 20 | West End | 11.000 | 49.286253 | -123.134223 |
| 21 | West Point Grey | 12.000 | 49.263790 | -123.205870 |
This now gives us the median lat and long coordinates as well as tree trunk diameter per Vancouver neighbourhood.
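If the grouping step feels opaque, the sketch below reproduces the same idea (split the rows by neighbourhood, then take the median of each group) in plain Python. The rows in toy_rows are made up for illustration and are not taken from the tree dataset.

```python
from collections import defaultdict
from statistics import median

# Made-up rows: (neighbourhood, trunk diameter). Not taken from the real data.
toy_rows = [
    ("Kitsilano", 9.0),
    ("Kitsilano", 14.0),
    ("Kitsilano", 13.0),
    ("Downtown", 3.0),
    ("Downtown", 7.0),
]

# Collect the diameters of each neighbourhood, as a groupby does.
diameters_by_name = defaultdict(list)
for name, diameter in toy_rows:
    diameters_by_name[name].append(diameter)

# Reduce each group to a single median value.
toy_medians = {name: median(values) for name, values in diameters_by_name.items()}
print(toy_medians)
```

With an even number of values, as in the made-up Downtown group, the median is the average of the two middle values.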
Now we link the shape file with the median tree dataframe using lookups like we learned in Module 6.
The neighbourhood is stored in the properties field, which we can access using properties.name.
We then grab the diameter and name columns from median_df using LookupData().
We color the neighbourhoods based on trunk diameter size.
alt.Chart(data_geojson_remote).mark_geoshape().transform_lookup(
lookup='properties.name',
from_=alt.LookupData(median_df, 'name', ['diameter', 'name'])).encode(
color='diameter:Q',
tooltip='name:N').project(type='identity', reflectY=True)
Look at that! We have a choropleth map!
We learned that these can sometimes be a bit deceiving, so we can use point size to show trunk diameter instead.
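The lookup above behaves much like a left join on the name key: every shape is kept, and it picks up the requested fields from the row with the same name. A shape whose name has no match ends up without a diameter, so if some neighbourhoods come out blank, checking that the keys agree is a good first step. The sketch below illustrates the idea in plain Python; shape_names, median_by_name and the misspelt 'Kitsilanno' are made up for the example.

```python
# Hypothetical names read from the shapes (one is deliberately misspelt).
shape_names = ["Downtown", "Kitsilanno", "Marpole"]

# Hypothetical lookup table: neighbourhood name -> median diameter.
median_by_name = {"Downtown": 7.0, "Kitsilano": 13.0, "Marpole": 9.0}

# A lookup keeps every shape and attaches the matching value, or None if missing.
joined = [(name, median_by_name.get(name)) for name in shape_names]
print(joined)

# Names present on one side only point to a spelling or renaming problem.
unmatched_shapes = sorted(set(shape_names) - set(median_by_name))
unused_rows = sorted(set(median_by_name) - set(shape_names))
print(unmatched_shapes, unused_rows)
```

With the real data, the same comparison could be made between the name column of median_df and the properties.name values in the GeoJSON file.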
points = alt.Chart(median_df).mark_circle().encode(
longitude='longitude',
latitude='latitude',
size='diameter:Q',
color = 'diameter:Q',
tooltip='name').project(type= 'identity', reflectY=True)
points
And overlay it on our base Vancouver map.
(vancouver_map + points).configure_view(stroke=None)
In addition we could plot all the trees in the dataset using the latitude and longitude of each row/tree in the full dataframe.
points = alt.Chart(df).mark_circle(size=1, color='green').encode(
longitude='longitude',
latitude='latitude').project(type= 'identity', reflectY=True)
(vancouver_map + points).configure_view(stroke=None)